Detecting Fraud in Mobile Telephony Using Neural Networks
Authors
Abstract
Our work addresses three issues: detecting unusual changes in the consumption of mobile phone users; building the data structures that represent each user's recent and historic behavior from the information contained in a call; and the complexity of constructing a function of so many variables whose parameterization is not always known.
1 Description of the Problem
Existing fraud detection systems examine sequences of CDRs (Call Detail Records), comparing some function of the record fields against fixed criteria known as triggers. An activated trigger raises an alarm, which fraud analysts then investigate. These systems perform what is known as an absolute analysis of CDRs and are used to detect the extremes of fraudulent activity. A differential analysis, by contrast, monitors the behavior patterns of a mobile phone by comparing its most recent activity with its historic use; a change in the pattern of behavior is a suspicious sign of fraud.
Building a fraud detection system based on differential analysis raises two problems: (a) building and maintaining users' profiles, and (b) detecting changes in behavior.
Regarding the first problem, a differential fraud detection system needs information about the user's history together with samples of the most recent activity. An initial approach is to extract and encode the CDR information and store it in a given record format. Two types of record are needed: one, which we shall call CUP (Current User Profile), to store the most recent information, and another, called UPH (User Profile History), to store the historic information [1, 2].
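As a rough illustration of the two record types, the sketch below models the CUP and UPH as bounded queues of encoded calls, where the oldest CUP entry moves into the UPH when a new call arrives. The class name `UserProfileQueues`, the method `add_call`, and the queue sizes are assumptions made for this example; the text does not specify them.

```python
from collections import deque


class UserProfileQueues:
    """Hypothetical container for one user's recent (CUP) and historic (UPH) calls."""

    def __init__(self, cup_size=5, uph_size=20):
        # Sizes are illustrative assumptions, not values from the text.
        self.cup = deque(maxlen=cup_size)
        self.uph = deque(maxlen=uph_size)

    def add_call(self, encoded_call):
        # Once the CUP is full, its oldest entry moves into the UPH.
        if len(self.cup) == self.cup.maxlen:
            oldest_recent = self.cup.popleft()
            # A full deque with maxlen drops its own oldest entry on append,
            # which discards the oldest UPH entry.
            self.uph.append(oldest_recent)
        self.cup.append(encoded_call)


profile = UserProfileQueues(cup_size=2, uph_size=3)
for call in ["c1", "c2", "c3", "c4", "c5", "c6"]:
    profile.add_call(call)

print(list(profile.cup))  # the two most recent calls
print(list(profile.uph))  # the three calls that preceded them
```

Using `deque(maxlen=...)` keeps both records at a fixed size without explicit bookkeeping for the discarded UPH entry.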
When a new CDR for a given user arrives for processing, the oldest entry in the UPH record is discarded and the oldest entry in the CUP moves into the UPH. The new, encoded record then enters the CUP. It is necessary to find a way to classify these calls into groups or prototypes, with each call belonging to exactly one group.
Regarding the second problem, once the encoded image of each user's recent and historic consumption has been built, this information must be analyzed so that any anomaly in consumption is detected and triggers the corresponding alarm.
614 H. Grosser, P. Britos, and R. García-Martínez
2 Description of the Suggested Solution
To process the CDRs, a new record format must be created containing the following information: IMSI (International Mobile Subscriber Identity), date, time, duration, and type of call (LOC: local call, NAT: national call, INT: international call).
To construct and maintain the users' profiles, we must first fix the patterns that make up each profile. The patterns must carry information about the user's consumption. We propose using three SOM (Self-Organizing Map) networks, one each for LOC, NAT, and INT calls, to generate the patterns by grouping similar calls [3]. The user's profile is built from the patterns generated by the three networks. The data used to represent a pattern are the time of the call and its duration.
To fill the patterns, the call to be analyzed is encoded and the neural network decides which pattern it resembles. With this information, the CUP is adapted so that its frequency distribution reflects that the user is now more likely to make this type of call.
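The record format and the pattern lookup described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the field names, the scaling in `encode`, the prototype values in `PATTERNS`, and the sample IMSI are all made up for the example, and the prototypes merely stand in for the weights of three trained SOMs.

```python
import math
from dataclasses import dataclass


@dataclass
class CallRecord:
    imsi: str
    date: str        # e.g. "2005-03-14"
    time: str        # "HH:MM", 24-hour clock
    duration_s: int  # call duration in seconds
    call_type: str   # "LOC", "NAT" or "INT"


def encode(call, max_duration_s=3600):
    """Map a call to a 2-D point (time of day, duration), each scaled to [0, 1]."""
    hours, minutes = call.time.split(":")
    time_of_day = (int(hours) * 60 + int(minutes)) / (24 * 60)
    duration = min(call.duration_s, max_duration_s) / max_duration_s
    return (time_of_day, duration)


def best_matching_pattern(x, patterns):
    """Return the index of the prototype closest to x (Euclidean distance)."""
    return min(range(len(patterns)), key=lambda i: math.dist(x, patterns[i]))


# Hypothetical prototypes, one list per call type (assumed, not from the text).
PATTERNS = {
    "LOC": [(0.40, 0.02), (0.80, 0.10)],
    "NAT": [(0.50, 0.05), (0.90, 0.30)],
    "INT": [(0.10, 0.50)],
}

call = CallRecord("722070000000001", "2005-03-14", "19:30", 300, "LOC")
x = encode(call)
winner = best_matching_pattern(x, PATTERNS[call.call_type])
print(x, winner)
```

The linear time-of-day scaling treats 23:59 and 00:00 as far apart; a cyclic encoding would avoid that, at the cost of an extra dimension.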
Knowing that a user's profile has K patterns, made up of L LOC patterns, N NAT patterns, and I INT patterns, we can build a profile that represents the processed call and then adapt the CUP to that call. If the call is LOC, the N NAT patterns and the I INT patterns have a frequency of 0, and the L LOC patterns have a frequency distribution given by the equation [2]
$$ v_i = \frac{e^{-\lVert X - Q_i \rVert}}{\sum_{j=1}^{L} e^{-\lVert X - Q_j \rVert}} $$
where X is the encoded call being processed, v_i is the probability that call X belongs to pattern i, and Q_i is pattern i generated by the LOC neural network. If the call is NAT, L is replaced by N and the LOC and INT frequencies are 0; if the call is INT, L is replaced by I and the LOC and NAT frequencies are 0.
The CUP and UPH profiles are compared using the Hellinger distance [3] to determine whether the pattern of behavior has changed. A threshold on this distance establishes how different the CUP and UPH must be before an alarm is raised. Raising the threshold produces fewer alarms; lowering it produces more.
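The call profile and the profile comparison can be sketched in a few lines. This is an illustration under stated assumptions: the distance between X and Q_i is taken to be Euclidean, the Hellinger distance is used in its normalized form (range 0 to 1), and the prototype values, the example profiles, and the threshold of 0.3 are invented for the example.

```python
import math


def call_profile(x, patterns):
    """v_i = exp(-||x - Q_i||) / sum_j exp(-||x - Q_j||) over one call type."""
    weights = [math.exp(-math.dist(x, q)) for q in patterns]
    total = sum(weights)
    return [w / total for w in weights]


def hellinger(p, q):
    """Normalized Hellinger distance between two discrete distributions."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2)


def raises_alarm(cup, uph, threshold=0.3):
    # The threshold value is an assumption; the text leaves it tunable.
    return hellinger(cup, uph) > threshold


# Profile of one LOC call with L = 2, N = 2, I = 1 (K = 5 patterns in total).
loc_patterns = [(0.40, 0.02), (0.80, 0.10)]  # hypothetical LOC prototypes
v = call_profile((0.8125, 0.0833), loc_patterns)
full_profile = v + [0.0, 0.0] + [0.0]  # NAT and INT frequencies are 0

# Invented profiles: history dominated by LOC calls, recent use by INT calls.
uph = [0.5, 0.3, 0.1, 0.1, 0.0]
cup = [0.1, 0.1, 0.1, 0.1, 0.6]
print(full_profile)
print(hellinger(cup, uph), raises_alarm(cup, uph))
```

Because the exponentials are normalized by their sum, the LOC entries of `full_profile` always add up to 1, so the call profile is itself a frequency distribution that can be blended into the CUP.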
Similar resources
Providing a Model for Detecting Tax Fraud Based on the Personality Types of Corporate Financial Managers using the Neural Network Approach
One of the management measures to reduce tax liabilities is non-payment of taxes through tax fraud. Because personality factors may play a role in explaining tax ethics, examining personality traits and aspects of tax fraud can help to better understand the factors that influence tax decisions. The main purpose of this study is to provide a model for detecting tax fraud based on the personality...
User profiling and classification for fraud detection in mobile communications networks
The topic of this thesis is fraud detection in mobile communications networks by means of user profiling and classification techniques. The goal is to first identify relevant user groups based on call data and then to assign a user to a relevant group. Fraud may be defined as a dishonest or illegal use of services, with the intention to avoid service charges. Fraud detection is an important app...
Detecting Telecommunication Fraud using Neural Networks through Data Mining
Neural computing refers to a pattern recognition methodology for machine learning. The resulting model from neural computing is often called an artificial neural network (ANN) or a neural network. Neural networks have been used in many business applications for pattern recognition, forecasting, prediction and classification. Neural network computing is a key component for any data mining too...
A hybrid model based on machine learning and genetic algorithm for detecting fraud in financial statements
Financial statement fraud has increasingly become a serious problem for business, government, and investors. In fact, this threatens the reliability of capital markets, corporate heads, and even the audit profession. Auditors in particular face their apparent inability to detect large-scale fraud, and there are various ways to identify this problem. In order to identify this problem, the majori...
Predicting financial statement fraud using fuzzy neural networks
Fraud is a common phenomenon in business, and according to Section 24 of the Iranian Auditing Standards, it is the fraudulent act of one or more managers, employees, or third parties to derive unfair advantage and any intentional or unlawful conduct. Financial statements are a means of transmitting confidential management information about the...
Fraud Detection in Mobile Communications Using Supervised Neural Networks
We present the results of the development of the first prototype of a supervised neural network for the detection of fraud in mobile communications. We have developed this prototype in the framework of a project of the European Commission on Advanced Security for Personal Communications (ASPeCT), together with two other prototypes based on unsupervised neural networks and knowledge-based syste...